Preserving Discourse Structure when Simplifying Text

Author

  • Advaith Siddharthan
Abstract

Text simplification involves restructuring sentences by replacing particular syntactic constructs (like embedded clauses and appositives). The aim is to make the text easier to read for some target group (like aphasics and people with low reading ages) or easier to process by some program (like a parser or machine translation system). However, sentence-level syntactic restructuring can wreak havoc with the discourse structure of a text, actually making it harder to comprehend, and possibly even altering its meaning. In this paper, we present and evaluate techniques for detecting and correcting disruptions in discourse structure caused by syntactic restructuring. In particular, we look at the issues of preserving the rhetorical relationships between the original clauses and phrases and preserving the anaphoric link structure of the text.

Similar resources

A Noisy-Channel Model for Document Compression

We present a document compression system that uses a hierarchical noisy-channel model of text production. Our compression system first automatically derives the syntactic structure of each sentence and the overall discourse structure of the text given as input. The system then uses a statistical hierarchical model of text production in order to drop non-important syntactic and discourse constit...

The Prosody of Discourse Structure and Content in the Production of Persian EFL Learners

The present research addressed the prosodic realization of global and local text structure and content in the spoken discourse data produced by Persian EFL learners. Two newspaper articles were analyzed using Rhetorical Structure Theory. Based on these analyses, the global structure in terms of hierarchical level, the local structure in terms of the relative importance of text segments and the ...

Simplifying metaphorical language for young readers: A corpus study on news text

The paper presents first results of an ongoing project on text simplification focusing on linguistic metaphors. Based on an analysis of a parallel corpus of news text professionally simplified for different grade levels, we identify six types of simplification choices falling into two broad categories: preserving metaphors or dropping them. An annotation study on almost 300 source sentences wit...

A Novel Method for Automatically Generating Multi-Modal Dialogue from Text

In this article, we propose a novel method for generating engaging multi-modal content automatically from text. Rhetorical Structure Theory (RST) is used to decompose text into discourse units and to identify rhetorical discourse relations between them. Rhetorical relations are then mapped to question–answer pairs in an information preserving way, i.e., the original text and the resulting dialo...

Robust Text Analysis via Underspecification

This paper is concerned with the robust analysis of the discourse structure of a text via underspecification. Most current discourse theories (e.g. Rhetorical Structure Theory (RST) by Mann and Thompson (1988), Abduction by Hobbs et al. (1993) or Segmented Discourse Representation Theory (SDRT) by Asher (1993)) require detailed world and context knowledge for the derivation of the discourse str...

Publication date: 2003